Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ANTLR SQLite grammar for batch splitting #1729

Closed

Conversation

itowlson
Copy link
Collaborator

@itowlson itowlson commented Sep 1, 2023

This addresses a user issue where SQLite extensions supported by the libsql vendor could not be used in spin up --sqlite @file. It aligns our statement splitter with the one used by libsql-go, which is probably as official as we're going to get.

There are possible maintenance implications if we ever need to update the grammar; I have documented the process but it is very much a manual one.

As noted, the goal here is to align with libsql-go. To do so, I have followed the Go implementation as closely as the borrow checker will let me. This means the code is (to my eyes) not very idiomatic Rust, and (to Clippy's eyes) an abomination before heaven. We could make an idiomatic implementation, but then it would be harder to follow if we ever needed to port an update from the Go implementation. Maybe it's not so huge that it would be a concern, but my understanding of the original was not great, so I chose to keep it as similar as I could - open to feedback for sure, but I'd prefer that to be at the level of "to what extent do we want to align" rather than pointing out individual un-Rustacean bits!

The huge sqlite_lexer file is ANTLR-generated; the source is documented in the mod.rs comments.

It would be good to get some samples for tests, since the maintenance of this is now on us.

@itowlson itowlson force-pushed the i-cant-believe-its-not-sql branch from b1fbe7d to 51c8a45 Compare September 1, 2023 01:29
@itowlson
Copy link
Collaborator Author

itowlson commented Sep 1, 2023

Oh, heck, even though we only use the output (and it builds fine), it seems like the ANTLR runtime has a custom build command that is making the e2e tests sad...

@itowlson
Copy link
Collaborator Author

itowlson commented Sep 1, 2023

Is it... is it trying to regenerate its test grammars? https://github.com/rrevenantt/antlr4rust/blob/master/build.rs

But why, then, would it build in, er, build, but not in e2e?!

Signed-off-by: itowlson <ivan.towlson@fermyon.com>
@itowlson itowlson force-pushed the i-cant-believe-its-not-sql branch from 51c8a45 to 191f798 Compare September 1, 2023 03:07
@itowlson
Copy link
Collaborator Author

itowlson commented Sep 1, 2023

Ah, there's a beta on crates.io. Let's see if that's better behaved than <random-sha>.

@rylev
Copy link
Collaborator

rylev commented Sep 14, 2023

@itowlson this PR scares me - I'd really rather not have to maintain this if possible. Perhaps we're going about this the wrong way, and instead we should be exploring ways we can avoid having to split strings into multiple SQL statements. Perhaps there's a way we can just send the entire string as one thing and let any downstream processes do the splitting if they need to. I realize this might require asking for features in projects we don't control, but perhaps that is a more sure first step?

@lann
Copy link
Collaborator

lann commented Sep 14, 2023

We could parse the statements with libsqlite3-sys... 😬

@itowlson
Copy link
Collaborator Author

@rylev That would be vastly preferable. The Go libsql client already accepts multiple statements in a single string, in order to be consistent with the Go SQL interface; if we can persuade libsql to have a single-string function in the Rust client too, then this whole problem goes away.

@itowlson
Copy link
Collaborator Author

@rylev To be clear re "ways we can avoid having to split strings into multiple SQL statements"... We are splitting because somewhere at some deep level (I think sqld / hrana, but I didn't keep adequate notes when diving into this, sorry - if you follow the link in the code it should take you to the starting point was the starting point for me) there is an interface (hrana BatchRequest I think) which accepts an array of statements, and rejects strings that contain multiple statements. We don't see this in our own sqld interactions because they're written in Go, and Go (in order to conform to its standard SQL API) disguises this by splitting internally before handing off to the database layer.

So as far as I can tell "avoid having to split" means either:

  • Getting the server to allow composites in a standard request
  • Doing the splitting inside the libsql client where we don't have to look at it or maintain it (the Go solution)

Not sure of any other options but very much open to anything that makes this not our problem!

@lann
Copy link
Collaborator

lann commented Sep 15, 2023

This seems to work: #1772

@itowlson
Copy link
Collaborator Author

Closing in favour of the frankly less alarming #1772.

@itowlson itowlson closed this Sep 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants